PART I


Answer the following multiple choice questions (Questions 1-20) on the given
MULTIPLE CHOICE ANSWER SHEET. Use a 2B pencil to mark the correct answer.


Question 1 [1 Mark] 


Which one of these variables is a continuous random variable? 


1. The number of students who score above 73.5% in an exam. 

2. The time it takes a randomly selected student to complete an exam. 

3. The number of women taller than 170 cm in a random sample of 5 women. 
4. The number of tattoos a randomly selected person has. 


5. The number of correct guesses on a multiple choice test. 


Question 2 [1 Mark] 


A randomly selected sample of 1,000 college students was asked whether they had ever 
used the drug Ecstasy. Sixteen percent (16% or 0.16) of the 1,000 students surveyed 
said they had. Which one of the following statements about the number 0.16 is correct? 


1. It is a sample proportion. 

2. It is a population proportion. 

3. It is a margin of error. 

4. It is a randomly chosen number. 


5. None of the above. 
Question 3 [1 Mark]


Pulse rates of adult men are approximately normal with a mean of 70 and a standard 
deviation of 8. Which choice correctly describes how to find the proportion of men 
that have a pulse rate greater than 78? 


1. Find the area to the left of z = 1 under a standard normal curve. 
2. Find the area between z = -1 and z = 1 under a standard normal curve.
3. Find the area to the right of z = 1 under a standard normal curve. 


4. Find the area to the right of z = -1 under a standard normal curve.


5. None of the above. 


Question 4 [1 Mark] 


Which one of the following probabilities is a “cumulative” probability? 


1. The probability that there are exactly 4 people with Type O+ blood in a sample
of 10 people. 


2. The probability of exactly 3 heads in 6 flips of a coin. 


3. The probability that the accumulated annual rainfall in a certain city next year, 
rounded to the nearest inch, will be 18 inches. 


4. The probability that a randomly selected woman’s height is 67 inches or less. 


5. None of the above. 


Question 5 [1 Mark] 


A medical treatment has a success rate of 0.8. Two patients will be treated with this 
treatment. Assuming the results are independent for the two patients, what is the 
probability that neither one of them will be successfully cured? 


1. 0.5 
2. 0.36 
3. 0.2 
4. 0.04 
5. 0.64 
This scenario applies to Questions 6 to 8: A randomized experiment was done
by randomly assigning each participant either to walk for half an hour three times a 
week or to sit quietly reading a book for half an hour three times a week. At the end 
of a year the change in participants’ blood pressure over the year was measured, and 
the change was compared for the two groups. 


Question 6 [1 Mark] 


This is a randomized experiment rather than an observational study because: 


1. Blood pressure was measured at the beginning and end of the study. 
2. The two groups were compared at the end of the study. 


3. The participants were randomly assigned to either walk or read, rather than 
choosing their own activity. 


4. A random sample of participants was used. 


5. None of the above. 


Question 7 [1 Mark] 


The two treatments in this study were: 


1. Walking for half an hour three times a week and reading a book for half an hour 
three times a week. 


2. Having blood pressure measured at the beginning of the study and having blood 
pressure measured at the end of the study. 


3. Walking or reading a book for half an hour three times a week and having blood 
pressure measured. 


4. Walking or reading a book for half an hour three times a week and doing nothing. 


5. None of the above. 
Question 8 [1 Mark]


If a statistically significant difference in blood pressure change at the end of a year for
the two activities was found, then:


1. It cannot be concluded that the difference in activity caused a difference in the
change in blood pressure because in the course of a year there are lots of possible
confounding variables.


2. It can be concluded that the difference in activity caused a difference in the
change in blood pressure because of the way the study was conducted.


3. It cannot be concluded that the difference in activity caused a difference in the
change in blood pressure because it might be the opposite, that people with high
blood pressure were more likely to read a book than to walk.


4. Whether or not the difference was caused by the difference in activity depends
on what else the participants did during the year.


5. None of the above.


Question 9 [1 Mark] 


Null and alternative hypotheses are statements about: 


1. Population parameters.

2. Sample parameters.

3. Sample statistics.

4. Population statistics.


5. It depends: sometimes parameters and sometimes statistics.


Question 10 [1 Mark] 


Which of the following is not a correct way to state a null hypothesis? 


1. H0: p1 - p2 = 0
2. H0: μd = 10
3. H0: μ1 - μ2 = 0
4. H0: p = 0.5
5. He =p = 0
Question 11 [1 Mark]


In a survey, a random sample of men and women answered the question "Are you a
member of any sports clubs?" Based on the sample data, 95% confidence intervals for
the population proportion who would answer yes are 0.13 to 0.19 for women and 0.247 
to 0.33 for men. Based on these results, you can reasonably conclude that 


1. At least 25% of men and women belong to sports clubs. 
2. At least 16% of women belong to sports clubs. 


3. There is a difference between the proportions of men and women who belong to 
sports clubs. 


4. There is no conclusive evidence of a gender difference in the proportion belonging 
to sports clubs. 


5. None of the above. 


This scenario applies to Questions 12 to 14: A survey asked people how often 
they exceed speed limits. The data are then categorized into the following contingency 
table of counts showing the relationship between age group and response. 


              Exceed Limit if Possible?
Age           Always    Not Always    Total
Under 30         100           100      200
Over 30           40           160      200
Total            140           260      400


Question 12 [1 Mark]


Among people with age over 30, what’s the “risk” of always exceeding the speed limit? 


1. 0.20 
2. 0.40 
3. 0.33 
4. 0.50 
5. 0.15 
Question 13 [1 Mark]


Among people with age under 30 what are the odds that they always exceed the speed 
limit? 


1. 1 to 2
2. 2 to 1
3. 1 to 1
4. 50% 

5. 100% 


Question 14 [1 Mark] 


What is the relative risk of always exceeding the speed limit for people under 30
compared to people over 30?


1. 2.5
2. 0.4
3. 0.5
4. 30%
5. None of the above.
Question 15 [1 Mark]
It is hypothesized that the numbers of a particular organism found in 100 samples
of water from a pond will follow a Poisson distribution with a mean of 1.9. 


Refer to Figure 1 to answer the following question. 


The expected proportion of samples out of 100 that would contain exactly two of that 
particular organism would be: 


5. 0.57 


[Plot: Poisson distribution function, with P(X ≤ k) on the vertical axis]


Figure 1: Poisson Distribution Function: X ~ Pois(μ = 1.9)
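
For readers without access to the plot, the following is a minimal sketch of how the quantity shown in Figure 1 can be tabulated. It assumes the figure shows the cumulative distribution function P(X ≤ k) for a mean of 1.9; the helper names `poisson_pmf` and `poisson_cdf` are illustrative and not part of the exam.

```python
import math


def poisson_pmf(k, mean):
    """P(X = k) for a Poisson random variable with the given mean."""
    return math.exp(-mean) * mean ** k / math.factorial(k)


def poisson_cdf(k, mean):
    """P(X <= k), obtained by summing the probabilities of 0, 1, ..., k."""
    return sum(poisson_pmf(i, mean) for i in range(k + 1))


mean = 1.9  # mean stated in the question
for k in range(7):
    # Each row gives the height of the distribution function at k.
    print(k, round(poisson_cdf(k, mean), 3))
```

The probability of exactly k organisms is the difference between two adjacent heights, P(X ≤ k) - P(X ≤ k - 1).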
Question 16 [1 Mark]


A chi-square test of the relationship between personal perception of emotional health 
and marital status led to rejection of the null hypothesis, indicating that there is a 
relationship between these two variables. One conclusion that can be drawn is: 

1. Marriage leads to better emotional health. 

2. Better emotional health leads to marriage. 

3. The more emotionally healthy someone is, the more likely they are to be married. 


4. There are likely to be confounding variables related to both emotional health and 
marital status. 


5. None of the above. 


Question 17 [1 Mark] 


A chi-square test involves a set of counts called expected counts. What are the expected 
counts? 

1. Hypothetical counts that would occur if the alternative hypothesis were true. 

2. Hypothetical counts that would occur if the null hypothesis were true. 

3. The actual counts that did occur in the observed data. 


4. The long-run counts that would be expected if the observed counts are
representative.


5. None of the above. 


Question 18 [1 Mark] 


A sampling distribution is the probability distribution for which one of the following: 


1. A sample. 

2. A sample statistic. 

3. A population. 

4. A population parameter. 


5. None of the above. 
Question 19 [1 Mark]


Which statement is not true about confidence intervals? 


1. A confidence interval is an interval of values computed from sample data that is 
likely to include the true population value. 


2. An approximate formula for a 95% confidence interval is 
sample estimate ± margin of error.


3. A confidence interval between 20% and 40% means that the population
proportion lies between 20% and 40%.


4. A 99% confidence interval procedure has a higher probability of producing
intervals that will include the population parameter than a 95% confidence interval
procedure.


5. None of the above. 


Question 20 [1 Mark] 


Which of the following examples involves paired data? 


1. A study compared the average number of courses taken by a random sample 
of 100 freshmen at a university with the average number of courses taken by a 
separate random sample of 100 freshmen at a community college. 


2. A group of 100 students were randomly assigned to receive vitamin C (50
students) or a placebo (50 students). The groups were followed for 2 weeks and the
proportions with colds were compared.


3. A group of 50 students had their blood pressures measured before and after 
watching a movie containing violence. The mean blood pressure before the movie 
was compared with the mean pressure after the movie. 


4. A random sample of 50 female students and a random sample of 50 male students 
had their blood pressures measured at the start of a lecture. The mean blood 
pressure was compared for the two groups. 


5. None of the above. 
PART II


Answer the following questions in the space provided. 
If you run out of space you may use the answer book provided. 


Question 21 [4 Marks] 


A survey showed that among 785 randomly selected subjects who had completed 3 
years of tertiary education, 18.3% smoke. 


(a) Construct a 95% confidence interval for the true percentage of smokers among 
all people who have completed 3 years of tertiary education. [2 Marks] 


95% confidence interval: 


(b) With reference to your result in (a), can you conclude that the smoking rate for 
those with 3 years of tertiary education is different from the 27% rate among the 
general population? Explain your response. [2 Marks] 
Question 22 [7 Marks]


An experiment was performed to compare the mean time required for bodily absorption 
of two brands (A and B) of headache remedies. Twelve people were randomly selected 
and given an oral dose of Brand A, and another 12 were randomly selected and given an 
equal dose of Brand B. The length of time in minutes for the drugs to reach a specified 
level in the blood were recorded. The results are summarized below in Table 1. 


Table 1: 


mean sd n 
A 21.8 8.7 12 
B 18.9 7.5 12 


Welch Two Sample t-test

data: time
t = 0.875, df = 21.5, p-value = 0.392
alternative hypothesis: true difference in means is not equal to 0
95 percent confidence interval:
 -4.0  9.8
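
The test statistic and degrees of freedom in this output can be checked from the summary statistics in Table 1. The sketch below assumes the usual Welch formulas (unpooled standard error and the Welch-Satterthwaite degrees of freedom); the variable names are illustrative only.

```python
import math

# Summary statistics copied from Table 1.
mean_a, sd_a, n_a = 21.8, 8.7, 12
mean_b, sd_b, n_b = 18.9, 7.5, 12

# Each brand's contribution to the variance of the difference in sample means.
var_term_a = sd_a ** 2 / n_a
var_term_b = sd_b ** 2 / n_b

# Unpooled standard error and the resulting t statistic.
se_diff = math.sqrt(var_term_a + var_term_b)
t_stat = (mean_a - mean_b) / se_diff

# Welch-Satterthwaite approximation to the degrees of freedom.
welch_df = (var_term_a + var_term_b) ** 2 / (
    var_term_a ** 2 / (n_a - 1) + var_term_b ** 2 / (n_b - 1)
)

print(round(t_stat, 3), round(welch_df, 1))
```

Rounded to the precision of the output, both values agree with the reported t and df.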


(a) State, in words and using statistical notation, the null hypothesis for the test in 
Table 1. 
[2 Marks] 


H0:


(b) With reference to the output in Table 1, test the relevant hypothesis, giving an
informative conclusion. [3 Marks] 


Results: 


Conclusion: 


(c) Briefly outline how the design and analysis of the experiment would differ if this 
had been conducted using paired samples. (No calculations are necessary.) 
[2 Marks] 
Question 23 [7 Marks]
In order to recover after knee surgery, a patient trains on a low-impact elliptical exercise
machine. The patient's pulse is measured versus the speed of the machine. The data
follow:


Speed, kph    Pulse, bpm
     0            57
     1.6          69
     3.1          78
     4            80
     5            85
     6            87
     6.9          90
     7.7          92
     8.7          97
    12.4         108
    15.3         119


R output for the simple linear regression is given below: 


Call:
lm(formula = pulse ~ speed)

Coefficients:
            Estimate Std. Error t value Pr(>|t|)
(Intercept)  63.3568     1.5092   41.98 1.23e-11 ***
speed         3.7493     0.1949   19.24 1.28e-08 ***

Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1


Residual standard error: 2.792 on 9 degrees of freedom 
Multiple R-squared: 0.9763, Adjusted R-squared: 0.9736 
F-statistic: 370.1 on 1 and 9 DF, p-value: 1.279e-08 
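
The coefficients and residual standard error above can be reproduced with the ordinary least-squares formulas. This is a minimal sketch rather than the R call itself; it assumes the eighth speed value is 7.7 kph, which is the value implied by the reported intercept and slope.

```python
import math

# Data from the table; the eighth speed (7.7) is an assumption, see the note above.
speed = [0, 1.6, 3.1, 4, 5, 6, 6.9, 7.7, 8.7, 12.4, 15.3]
pulse = [57, 69, 78, 80, 85, 87, 90, 92, 97, 108, 119]

n = len(speed)
mean_x = sum(speed) / n
mean_y = sum(pulse) / n

# Sums of squares and cross-products about the means.
sxx = sum((x - mean_x) ** 2 for x in speed)
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(speed, pulse))

# Least-squares estimates.
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Residual standard error on n - 2 degrees of freedom.
sse = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(speed, pulse))
residual_se = math.sqrt(sse / (n - 2))

print(round(intercept, 4), round(slope, 4), round(residual_se, 3))
```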


(a) Add axis labels to the plot below. Be sure to put the explanatory variable on the
x axis and the response variable on the y axis. Draw an approximate regression
line. [1 Mark]


(b) A number of conditions must be met for a regression to be valid. One of those
conditions is that the residuals are independent from each other. In the space
below, sketch an example where the independence assumption is not met. [1 Mark]


[Blank plot: Residual on the vertical axis, Predicted Value on the horizontal axis]
For questions (c) through (f), assume that all regression assumptions have been
met. 


(c) Write the estimated regression equation. [1 Mark] 


Equation: 


(d) Predict the patient’s pulse when the speed of the elliptical exercise machine 
reaches 25 kph, and explain why this prediction might not be trustworthy. 
[2 Marks] 


Predicted pulse when speed = 25 kph: 


Explanation why the prediction might not be trustworthy: 


(e) Calculate the 95% confidence interval for the slope. Use t* = 2.26. [1 Mark]


Confidence interval: 


(f) Interpret the confidence interval from (e) [1 Mark] 


Question 24 [6 Marks]


Four technicians (A, B, C, D) measure phosphorus in hay (mg of phosphorus per g of 
hay), where each technician is given 5 samples to measure out of a total of 20 samples. 
We wish to determine whether phosphorus concentration differs with the technician 
performing the analysis. The data follow: 


     Technician
 A    B    C    D
34   37   34   36
36   36   37   34
34   35   35   37
35   37   37   34
34   37   36   35


The ANOVA table from R follows: 


Analysis of Variance Table

Response: Phosphorus
             Df   Sum Sq   Mean Sq   F value   Pr(>F)
Technician   ___       9       ___       2.4   0.1059
Residuals     16      20      1.25
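
The two Sum Sq entries can be checked directly from the data. The sketch below assumes the twenty readings listed in the code; the third and fourth readings of each technician are a reconstruction chosen to be consistent with the sums of squares reported in the table, so treat them as an assumption.

```python
# Phosphorus readings (mg per g of hay) by technician.
# Assumption: the third and fourth readings in each list are reconstructed values.
readings = {
    "A": [34, 36, 34, 35, 34],
    "B": [37, 36, 35, 37, 37],
    "C": [34, 37, 35, 37, 36],
    "D": [36, 34, 37, 34, 35],
}

all_values = [v for group in readings.values() for v in group]
grand_mean = sum(all_values) / len(all_values)

# Between-technician sum of squares: group size times squared deviation
# of each technician's mean from the grand mean.
ss_technician = sum(
    len(group) * (sum(group) / len(group) - grand_mean) ** 2
    for group in readings.values()
)

# Residual sum of squares: squared deviations of readings from their own group mean.
ss_residual = sum(
    (v - sum(group) / len(group)) ** 2
    for group in readings.values()
    for v in group
)

print(round(ss_technician, 2), round(ss_residual, 2))
```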


(a) Calculate the missing degrees of freedom and Mean Square value for Technician. 
[2 Marks] 


df: 


Mean Sq: 


(b) Shade in the area that corresponds to the p-value. [1 Mark]


[Plot: curve over the horizontal axis F, marked from 0.0 to 6.0]


(c) Interpret the p-value in the context of the scientific question. [2 Marks]


(d) Suppose the data exhibit moderate skewness with two mild outliers. Would this
change your conclusion? Explain why or why not. [1 Mark]
Question 25 [6 Marks]


A deadly new disease kills many infected patients within two years. A two-way table 
showing two-year survival (Yes or No) versus Physician Experience (the level of
experience the treating physician has: Least, Moderate, Most) and the result of a Chi-Square
test are shown below for n = 403 patients. 


                Survival
Experience     Yes     No
Least           38     69
Moderate        61     99
Most            70     66


Pearson’s Chi-squared test 


data: table 
X-squared = 7.8441, df = ___, p-value = 0.0198 
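
The X-squared value in this output can be reproduced from the two-way table. This is a sketch under the standard Pearson formula, with each expected count computed as row total times column total divided by n; the variable names are illustrative.

```python
# Observed counts from the table: rows are Least, Moderate, Most; columns are Yes, No.
observed = [
    [38, 69],
    [61, 99],
    [70, 66],
]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
n = sum(row_totals)

chi_squared = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        # Expected count if survival and physician experience were unrelated.
        expected = row_totals[i] * col_totals[j] / n
        chi_squared += (obs - expected) ** 2 / expected

print(n, round(chi_squared, 4))
```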


(a) State (in words) the null and alternative hypotheses for this Chi-Square Test. 
[2 Marks] 


(b) Calculate the contribution to the χ² statistic from those patients who were treated
by physicians who were moderately experienced and did not survive. [1 Mark]


(O22 - E22)² / E22 =
(c) Find the missing degrees of freedom for this Chi-Square Test. [1 Mark] 
df: 
(d) Interpret the p-value in the context of the scientific question. [2 Marks] 
Formulae are given on page 23


Please remember: This examination paper MUST BE HANDED IN. Failure to do so 
may result in the cancellation of all marks for this examination. Writing your name and 
number on the front will help us confirm that your paper has been returned. 
Formulae


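$P(A \mid B) = \frac{P(A \text{ and } B)}{P(B)}$        $P(A \mid B) = P(A)$ if A and B are independent

$P(A \text{ and } B) = P(A)P(B)$ if A and B are independent

$z = \frac{x - \mu}{\sigma}$        $z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$

$t = \frac{\bar{x} - \mu}{s/\sqrt{n}}$        $se(\bar{x}) = \frac{s}{\sqrt{n}}$

$t = \frac{\bar{d} - \mu_d}{s_d/\sqrt{n}}$        $se(\bar{d}) = \frac{s_d}{\sqrt{n}}$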
8 MSE 
zit z — Zj) = 
£ x se(Z) se(Z) Th se(Z;) a 
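$t = \frac{\bar{x}_1 - \bar{x}_2}{se(\bar{x}_1 - \bar{x}_2)}$        $se(\bar{x}_1 - \bar{x}_2) = \sqrt{\frac{s_1^2}{n_1} + \frac{s_2^2}{n_2}}$

$se(\bar{x}_1 - \bar{x}_2) = s_p \sqrt{\frac{1}{n_1} + \frac{1}{n_2}}$        $s_p = \sqrt{\frac{(n_1 - 1)s_1^2 + (n_2 - 1)s_2^2}{n_1 + n_2 - 2}}$

$\hat{p} \pm z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$        margin of error $= z^* \sqrt{\frac{\hat{p}(1 - \hat{p})}{n}}$

$(\hat{p}_1 - \hat{p}_2) \pm z^* \times se(\hat{p}_1 - \hat{p}_2)$        $se(\hat{p}_1 - \hat{p}_2) = \sqrt{\frac{\hat{p}_1(1 - \hat{p}_1)}{n_1} + \frac{\hat{p}_2(1 - \hat{p}_2)}{n_2}}$

$\chi^2 = \sum_i \frac{(O_i - E_i)^2}{E_i}$        $E_i = n p_i$

$\chi^2 = \sum_i \sum_j \frac{(O_{ij} - E_{ij})^2}{E_{ij}}$        $E_{ij} = \frac{R_i \times C_j}{n}$

$t = \frac{b_1}{se(b_1)}$

Risk = Number in category / Total number in group

Relative risk = Risk in category 1 / Risk in category 2

Odds ratio = Odds in category 1 / Odds in category 2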
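
As an illustration of the last three formulae, the sketch below computes risk, relative risk, and odds ratio for two groups. The counts are hypothetical and are not taken from any question in this paper; the function names `risk` and `odds` are likewise illustrative.

```python
def risk(count, total):
    """Risk = number in category / total number in group."""
    return count / total


def odds(count, total):
    """Odds = number in category / number not in category."""
    return count / (total - count)


# Hypothetical counts: 30 of 120 in group 1 and 15 of 150 in group 2.
group1_count, group1_total = 30, 120
group2_count, group2_total = 15, 150

relative_risk = risk(group1_count, group1_total) / risk(group2_count, group2_total)
odds_ratio = odds(group1_count, group1_total) / odds(group2_count, group2_total)

print(round(relative_risk, 2), round(odds_ratio, 2))
```

Risk divides by the whole group, while odds divide by the part of the group outside the category, so the two ratios generally differ.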